1,453 research outputs found

    Improving Data Integration through Disambiguation Techniques

    This paper outlines the Word Sense Disambiguation (WSD) problem in the context of data integration and proposes an Approximate Word Sense Disambiguation (AWSD) approach for the automatic lexical annotation of structured and semi-structured data sources.

    Building an Urban Theft Map by Analyzing Newspaper Crime Reports

    One of the main issues in today's cities is public safety, which can be improved by systematically identifying and analyzing patterns and trends in crime, a practice also called crime mapping. Mapping crime allows police analysts to identify crime hot spots; moreover, it increases public confidence and citizen engagement and promotes transparency. This paper focuses on analyzing and mapping thefts reported in online newspapers, using text mining techniques, for an Italian city.

    High-level visualization over big linked data

    The Linked Open Data (LOD) Cloud is continuously expanding, and the number of large and complex sources is rising. Understanding an unknown source at a glance is a critical task for LOD users, but it can be facilitated by visualization or exploration tools. H-BOLD (High-level visualization over Big Open Linked Data) is a tool that allows users with neither a priori knowledge of the domain nor SPARQL skills to start navigating and exploring Big Linked Data. Users can start from a high-level visualization and then focus on an element of interest to incrementally explore the source, as well as perform a visual query on certain classes of interest. At the moment, 32 Big Linked Data sources (with more than 500,000 triples) exposing a SPARQL endpoint can be explored using H-BOLD.

    Providing effective visualizations over big linked data

    The number and the size of Linked Data sources are constantly increasing. In a few fortunate cases, the data source is equipped with a tool that guides and helps the user during the exploration of the data, but in most cases the data are published as an RDF dump through a SPARQL endpoint that can be accessed only through SPARQL queries. Although the RDF format was designed to be processed by machines, there is a strong need for visualization and exploration tools. Data visualizations make big and small linked data easier for humans to understand, and they also make it easier to detect patterns, trends, and outliers in groups of data. For this reason, we developed a tool called H-BOLD (High-level Visualization over Big Linked Open Data). H-BOLD aims to help the user explore the content of a Linked Data source by providing a high-level view of the structure of the dataset and an interactive exploration that allows users to focus on the connections and attributes of one or more classes. Moreover, it provides a visual interface for querying the endpoint that automatically generates SPARQL queries.

    A Visual Summary for Linked Open Data sources

    In this paper we propose LODeX, a tool that produces a representative summary of a Linked Open Data (LOD) source starting from scratch, thus supporting users in exploring and understanding the contents of a dataset. The tool takes as input the URL of a SPARQL endpoint and launches a set of predefined SPARQL queries; from the results of these queries it generates a visual summary of the source. The summary reports statistical and structural information about the LOD dataset, and it can be browsed to focus on particular classes or to explore their properties and their use. LODeX was tested on the 137 public SPARQL endpoints contained in Data Hub (formerly CKAN), one of the main Open Data catalogues. The statistical and structural information from the 107 successful extractions has been collected and is available in the online version of LODeX (http://dbgroup.unimo.it/lodex).
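
    As a rough illustration of the step described above (an endpoint URL goes in, a set of predefined SPARQL queries is launched), the sketch below builds two generic summary queries and the corresponding GET request URLs, using the `query` parameter of the SPARQL 1.1 Protocol. The query texts, the names `PREDEFINED_QUERIES` and `build_request_urls`, and the endpoint `http://example.org/sparql` are assumptions made for illustration; the abstract does not give the queries LODeX actually uses.

```python
from urllib.parse import urlencode

# Hypothetical predefined queries; the abstract does not list the real ones.
PREDEFINED_QUERIES = {
    # How many instances does each instantiated class have?
    "instantiated_classes": (
        "SELECT ?class (COUNT(?s) AS ?instances) "
        "WHERE { ?s a ?class } GROUP BY ?class"
    ),
    # How often is each property used?
    "property_usage": (
        "SELECT ?p (COUNT(*) AS ?uses) "
        "WHERE { ?s ?p ?o } GROUP BY ?p"
    ),
}


def build_request_urls(endpoint_url):
    """Return one GET URL per predefined query (SPARQL 1.1 Protocol 'query' parameter)."""
    return {
        name: endpoint_url + "?" + urlencode({"query": text})
        for name, text in PREDEFINED_QUERIES.items()
    }


# Placeholder endpoint; no network request is made here.
urls = build_request_urls("http://example.org/sparql")
```

    Each URL could then be fetched with any HTTP client and the results aggregated into a summary; that part is omitted because it depends on the endpoint and on the result format requested.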

    LODeX: A tool for Visual Querying Linked Open Data

    Formulating a query on a Linked Open Data (LOD) source is not an easy task: both technical knowledge of the query language and awareness of the structure of the dataset are essential to create a query. We present a revised version of LODeX that gives the user an easy way to build queries quickly and interactively. When a user decides to explore a LOD source, he/she can take advantage of the Schema Summary produced by LODeX (i.e. a synthetic view of the dataset’s structure) and pick graphical elements from it to create a visual query. The tool also supports the user in browsing the results and, if necessary, in refining the query. The prototype has been evaluated on hundreds of public SPARQL endpoints (listed in Data Hub) and is available online at http://dbgroup.unimo.it/lodex2. A survey conducted on 27 users has demonstrated that our tool can effectively support both unskilled and skilled users in exploring and querying LOD datasets.
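
    To make the idea of a visual query concrete, the sketch below shows one way a selection of graphical elements (a class and some of its properties) could be turned into a SPARQL SELECT query. This is a minimal sketch under stated assumptions, not the query generator of LODeX: the function `build_select_query`, the use of OPTIONAL patterns, the default LIMIT, and the FOAF URIs in the example are all chosen for illustration.

```python
def build_select_query(class_uri, property_uris, limit=100):
    """Build a SELECT query for instances of one class and some of its properties.

    Hypothetical helper: each selected property becomes an OPTIONAL pattern,
    so instances lacking that property are still returned.
    """
    variables = ["?s"]
    patterns = [f"?s a <{class_uri}> ."]
    for i, prop in enumerate(property_uris):
        var = f"?v{i}"
        variables.append(var)
        patterns.append(f"OPTIONAL {{ ?s <{prop}> {var} . }}")
    body = "\n  ".join(patterns)
    return f"SELECT {' '.join(variables)}\nWHERE {{\n  {body}\n}}\nLIMIT {limit}"


# Example selection: the class foaf:Person with the property foaf:name.
query = build_select_query(
    "http://xmlns.com/foaf/0.1/Person",
    ["http://xmlns.com/foaf/0.1/name"],
)
print(query)
```

    Refining the query, as the abstract mentions, would then amount to changing the selection and regenerating the text.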

    Online Index Extraction from Linked Open Data Sources

    The production of machine-readable data in the form of RDF datasets belonging to the Linked Open Data (LOD) Cloud is growing very fast. However, selecting relevant knowledge sources from the Cloud, assessing their quality, and extracting synthetic information from a LOD source are all tasks that require considerable human effort. This paper proposes an approach for the automatic extraction of the most representative information from a LOD source and the creation of a set of indexes that enhance the description of the dataset. These indexes collect statistical information regarding the size and the complexity of the dataset (e.g. the number of instances), but also depict all the instantiated classes and the properties among them, supplying the user with a synthetic view of the LOD source. The technique is fully implemented in LODeX, a tool able to deal with the performance issues of systems that expose SPARQL endpoints and to cope with the heterogeneity in the knowledge representation of RDF data. An evaluation of LODeX on a large number of endpoints (244) belonging to the LOD Cloud has been performed, and the effectiveness of the index extraction process is presented.
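
    The kind of indexes described above (dataset size, instances per class, and the properties linking instantiated classes) can be illustrated on a small in-memory set of triples. The sketch below is an assumption-laden toy, not the LODeX extraction process, which works through SPARQL endpoints: the function `extract_indexes`, the index names, and the sample data are all hypothetical.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # prefixed name used as a plain string in this toy example


def extract_indexes(triples):
    """Compute simple statistical and structural indexes from (s, p, o) triples."""
    # Map each typed subject to the set of classes it instantiates.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    # Statistical index: number of instances per instantiated class.
    instances_per_class = Counter(c for classes in types.values() for c in classes)

    # Structural index: how often a property links an instance of one class
    # to an instance of another (literals and untyped resources are skipped).
    class_links = Counter()
    for s, p, o in triples:
        if p != RDF_TYPE and s in types and o in types:
            for subject_class in types[s]:
                for object_class in types[o]:
                    class_links[(subject_class, p, object_class)] += 1

    return {
        "triples": len(triples),
        "instances_per_class": dict(instances_per_class),
        "class_links": dict(class_links),
    }


# Made-up sample data for illustration only.
sample = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:acme", RDF_TYPE, "ex:Company"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:alice", "ex:name", "Alice"),
]
indexes = extract_indexes(sample)
```

    On this sample, `indexes["instances_per_class"]` is `{"ex:Person": 2, "ex:Company": 1}` and the only class-level link is `ex:Person` to `ex:Company` via `ex:worksFor`, counted twice.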